On the Asymptotic Equivalence Between Differential Hebbian and Temporal Difference Learning

Authors

  • Christoph Kolodziejski
  • Bernd Porr
  • Florentin Wörgötter
Abstract

In this theoretical contribution, we provide mathematical proof that two of the most important classes of network learning, correlation-based differential Hebbian learning and reward-based temporal difference learning, are asymptotically equivalent when timing the learning with a modulatory signal. This opens the opportunity to consistently reformulate most of the abstract reinforcement learning framework from a correlation-based perspective more closely related to the biophysics of neurons.
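The claimed equivalence can be illustrated with a minimal discrete-time sketch (an illustrative assumption, not the authors' continuous-time derivation): a TD(0) update on a linear value estimate, next to a differential Hebbian update in which the temporal derivative of the postsynaptic output is gated by a modulatory third factor `m`. All function and parameter names here are hypothetical.

```python
import numpy as np

def td0_update(w, x_t, x_next, r, alpha=0.1, gamma=0.9):
    """TD(0) on a linear value estimate v = w . x."""
    v_t, v_next = w @ x_t, w @ x_next
    delta = r + gamma * v_next - v_t        # TD error
    return w + alpha * delta * x_t

def diff_hebb_update(w, x_t, x_next, m, mu=0.1):
    """Differential Hebbian rule gated by a modulatory third factor m(t):
    dw ~ mu * m * (presynaptic input) * (temporal derivative of output)."""
    dv = w @ x_next - w @ x_t               # discrete approximation of dv/dt
    return w + mu * m * x_t * dv
```

With the discount close to one and the third factor carrying the reward information, both rules move the weights in the same direction, which is the intuition behind the asymptotic equivalence; the paper's actual proof works with continuous, filtered signals.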


Similar articles

On the asymptotic equivalence between differential Hebbian and temporal difference learning using a local third factor

In this theoretical contribution we provide mathematical proof that two of the most important classes of network learning, correlation-based differential Hebbian learning and reward-based temporal difference learning, are asymptotically equivalent when timing the learning with a local modulatory signal. This opens the opportunity to consistently reformulate most of the abstract reinforcement lear...


Mathematical Description of Differential Hebbian Plasticity and its Relation to Reinforcement Learning

The human brain consists of more than a billion nerve cells, the neurons, each having several thousand connections, the synapses. These connections are not fixed but change all the time. In order to describe synaptic plasticity, different mathematical rules have been proposed, most of which follow Hebb's postulate. Donald Hebb suggested in 1949 that synapses only change if pre-synaptic activity...


Efficient Asymptotic Approximation in Temporal Difference Learning

Frédérick Garcia and Florent Serre. Abstract: TD(λ) is an algorithm that learns the value function associated to a policy in a Markov Decision Process (MDP). We propose in this paper an asymptotic approximation of online TD(λ) with accumulating eligibility trace, called ATD(λ). We then use the Ordinary Differential Equation (ODE) method to analyse ATD(λ) and to op...
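The accumulating eligibility trace that ATD(λ) approximates can be sketched as a standard tabular online TD(λ) loop; this is a generic textbook form, not the paper's ATD(λ) approximation, and the function name and parameters are illustrative.

```python
import numpy as np

def td_lambda_episode(states, rewards, V, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular online TD(lambda) with an accumulating eligibility trace.
    `states` is the visited state sequence; `rewards[t]` is received on the
    transition states[t] -> states[t+1]."""
    e = np.zeros_like(V)                    # eligibility trace over states
    for t in range(len(states) - 1):
        s, s_next = states[t], states[t + 1]
        delta = rewards[t] + gamma * V[s_next] - V[s]
        e[s] += 1.0                         # accumulating trace: add, don't clamp
        V += alpha * delta * e              # all recently visited states updated
        e *= gamma * lam                    # traces decay between steps
    return V
```

With λ = 0 this reduces to plain TD(0); with λ > 0 the TD error propagates backwards along the trajectory in a single pass.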


Spike-Timing-Dependent Hebbian Plasticity as Temporal Difference Learning

A spike-timing-dependent Hebbian mechanism governs the plasticity of recurrent excitatory synapses in the neocortex: synapses that are activated a few milliseconds before a postsynaptic spike are potentiated, while those that are activated a few milliseconds after are depressed. We show that such a mechanism can implement a form of temporal difference learning for prediction of input sequences....
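The asymmetric window described above (potentiation for pre-before-post, depression for post-before-pre) is commonly modeled with two exponentials; a minimal sketch, with illustrative amplitude and time-constant values:

```python
import math

def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change as a function of spike timing dt = t_post - t_pre (ms)."""
    if dt_ms > 0:       # pre spike precedes post spike: potentiation
        return a_plus * math.exp(-dt_ms / tau_plus)
    elif dt_ms < 0:     # pre spike follows post spike: depression
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0
```

The sign change at dt = 0 is what lets the rule act like a temporal difference: inputs that reliably precede the postsynaptic spike are strengthened and become predictors of it.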


A generative model for Spike Time Dependent Hebbian Plasticity

Based on neurophysiological observations on the behavior of synapses, Spike Time Dependent Hebbian Plasticity (STDHP) is a novel extension to the modeling of the Hebb rule. This rule has enormous importance in the learning of Spiking Neural Networks (SNN), but its mechanisms and computational properties are still to be explored. Here, we present a generative model for STDHP based on a simplified ...




Journal:
  • Neural Computation

Volume 21, Issue 4

Pages: -

Published: 2009